Technote 1018 | FEBRUARY 1996 |
Every attempt has been made to ensure compatibility with the classic Macintosh serial driver, which does not support DMA operation. While some minor behavioral differences exist, the primary differences are:
You should be familiar with Macintosh device drivers in general, and with the
classic Macintosh serial driver in particular. For reference, see Inside
Macintosh: Devices, chapters 1 and 7.
The original version of SerialDMA is identifiable as Serial Driver version 8. As Apple and several third parties have discovered, version 8 may simply be unsuitable for some categories of applications.
In response, Apple is now providing a second-generation SerialDMA driver (version 9), which corrects the design flaws in the first version, increases compatibility with the classic serial driver to the greatest extent possible, and optimizes performance to realize the full potential of the available DMA hardware. The effects of this re-architecture include
On the other hand, completion routines are now called with interrupts masked rather than at deferred task time, which eliminates one source of compatibility problems with ill-behaved client software.
During an asynchronous I/O request, driver clients cannot depend on the parameter block's ioActCount field to increment because the processor does not intervene in the transfer.
The second-generation driver implements sophisticated DMA channel management to arrive at a better compromise between the demands of responsiveness and latency tolerance. Responsiveness should now be indistinguishable from the classic serial driver. Latency tolerance should still provide ample margins.
Receive bandwidth has received special attention. Receive DMA channel availability is maintained at the highest possible level consistent with responsive behavior toward driver clients. While exact channel availability characteristics depend on the specifics of the DMA controller, the possibility is very small that the DMA controller will exhaust its transfer count and allow the SCC to overrun.
The bandwidth of memory and typical system interrupt latencies provide for support of the maximum 230.4K bps data stream, provided that exceptional events do not deplete the available DMA resources and provided the client manipulates the driver in an efficient manner. New driver Control codes make 115.2K and 230.4K bps modes easily accessible to client software without difficult or risky hacks and workarounds.
For development purposes, the easiest and quickest way to check the serial driver version installed on a system is with the MacsBug drvr dcmd. This dcmd displays all the drivers installed on a system with their respective version numbers (if specified).
Please read the section on Additional Details for an explanation of how the client may affect the performance and the potential throughput of the SerialDMA driver.
The first-generation driver may be identified in one of two ways. The client may make a Status call to the serial driver with a csCode of 9 to retrieve the driver's version; the first-generation driver returns the value 8 in the first byte of csParam. Alternatively, the client may issue a Control call with a csCode of 17987 and inspect the result. No other version of the serial driver implements this csCode, so only the first-generation driver returns a result of noErr (other versions return controlErr by default). This special Control call, which was unique to the first-generation SerialDMA driver, allowed customization of the DMA timeout latency between 1 and 65,535 ticks. The latency must be specified in the first 16-bit csParam word when issuing the Control call; a value of 1 is safe and provides the best performance, although it is somewhat inefficient.
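The decision logic described above can be sketched in C. This is only an illustration: the function and enum names are hypothetical, the two inputs are assumed to have been obtained already from the Status call (csCode 9) and the Control probe (csCode 17987), and it assumes the second-generation driver reports 9 as its version byte. Note that the Control probe also sets the DMA timeout latency on a first-generation driver, so a sensible latency value should be supplied when issuing it.

```c
#include <stdint.h>

/* Mac OS result codes as defined in the classic Toolbox headers. */
enum { kNoErr = 0, kControlErr = -17 };

/* Hypothetical labels for this sketch; not part of any Apple header. */
typedef enum {
    kSerialOther,        /* anything that is not recognizably version 8 or 9 */
    kSerialDMAFirstGen,  /* Serial Driver version 8 */
    kSerialDMASecondGen  /* Serial Driver version 9 (assumed version byte) */
} SerialDriverKind;

/*
 * versionByte: first byte of csParam after a Status call with csCode 9.
 * probeResult: result of a Control call with csCode 17987
 *              (pass kControlErr if the probe was not accepted).
 */
SerialDriverKind ClassifySerialDriver(uint8_t versionByte, int16_t probeResult)
{
    /* Only the first-generation driver accepts csCode 17987. */
    if (versionByte == 8 || probeResult == kNoErr)
        return kSerialDMAFirstGen;
    if (versionByte == 9)
        return kSerialDMASecondGen;
    return kSerialOther;
}
```

A client would typically use the result only to decide whether to work around first-generation behavior, not to decide whether the high-speed Control calls are available.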
The second-generation SerialDMA driver is the first serial driver to support csCodes to invoke 115.2K bps mode and 230.4K bps mode, so in some sense the driver version could be detected that way, but in the future, other driver versions may also support these calls. Therefore, if these functions are desired, make the Control calls without regard to the driver version.
The SerialDMA driver supports two new csCodes, 115 and 230, by which its Control routine can switch the driver to high-speed modes. These csCodes support 115.2K baud and 230.4K baud rates. The correct time to make these calls is after a normal SerReset call using some other (lower) baud rate. This is because SerReset performs a number of configuration tasks but assumes that the baud rate is a function of the SCC baud rate generator. To achieve these two higher speeds, the baud rate generator must be bypassed, using the standard 3.672 MHz SCC clock and one of a very limited number of rate divisors. These csCodes bypass the baud rate generator and set the clock divisor to achieve the specified rate.
csCode = 15 csParam = byte
This call is designed to place the serial driver into a quasi-MIDI mode. It is similar to a SerReset, but it always leaves the serial driver in a mode of eight data bits and one stop bit. Hardware handshaking is disabled. Clocking is required externally at the CTS pin. The parameter byte represents the factor by which the external clock frequency exceeds the data rate according to the following table. The rate multiplier is encoded in the two most significant bits of the parameter byte, while all other bits are reserved and should be zero.
| Encoding | Rate multiplier | Example |
|---|---|---|
| 0x00 | x 1 | |
| 0x40 | x 16 | 250K baud, 8x MIDI rate @ 4 MHz |
| 0x80 | x 32 | 125K baud, 4x MIDI rate @ 4 MHz |
| 0xC0 | x 64 | 31.25K baud, standard MIDI rate @ 2 MHz |
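As a worked check of the table, the following C sketch decodes the parameter byte and derives the data rate from the external clock. The function names are hypothetical and exist only for this illustration; the driver itself exposes no such helpers.

```c
#include <stdint.h>

/* Rate multiplier selected by the two most significant bits of the
   csCode 15 parameter byte: 0x00 -> 1, 0x40 -> 16, 0x80 -> 32, 0xC0 -> 64.
   The lower six bits are reserved and ignored here. */
unsigned MidiClockMultiplier(uint8_t param)
{
    static const unsigned kMultiplier[4] = { 1, 16, 32, 64 };
    return kMultiplier[(param >> 6) & 0x3];
}

/* Data rate in baud for a given external clock frequency in Hz. */
unsigned long MidiBaudRate(unsigned long externalClockHz, uint8_t param)
{
    return externalClockHz / MidiClockMultiplier(param);
}
```

For instance, `MidiBaudRate(2000000UL, 0xC0)` evaluates to 31250, the standard MIDI rate shown in the last row of the table.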
csCode = 115
This call is designed for high-speed modems. It typically requires DMA hardware on the receive channel to be successful. It is similar to the clock selection function available through csCode = 16, but it instead forces the serial driver to take its baud rate clock directly from the internal 3.672 MHz RTxC clock source with a rate multiplier of 32. The result is to force transmit and receive baud rates of nominally 115.2K baud. Other configuration parameters are not affected.
csCode = 230
This call is designed for high-speed modems. It typically requires DMA hardware on the receive channel to be successful. It is similar to the clock selection function available through csCode = 16, but it instead forces the serial driver to take its baud rate clock directly from the internal 3.672 MHz RTxC clock source with a rate multiplier of 16. The result is to force transmit and receive baud rates of nominally 230.4K baud. Other configuration parameters are not affected.
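The arithmetic behind these two modes can be sketched as follows. The helper names are hypothetical, and the clock constant is simply the 3.672 MHz figure quoted in this document; with that figure the division yields 114.75K and 229.5K, within about 0.4% of the nominal 115.2K and 230.4K rates.

```c
/* RTxC clock frequency in Hz, using the figure quoted in the text. */
#define kSCCClockHz 3672000UL

/* Rate multiplier (clock divisor) selected by each high-speed csCode;
   returns 0 for any other csCode. */
unsigned HighSpeedDivisor(int csCode)
{
    switch (csCode) {
        case 115: return 32;
        case 230: return 16;
        default:  return 0;
    }
}

/* Resulting baud rate, or 0 if csCode is not a high-speed code. */
unsigned long HighSpeedBaud(int csCode)
{
    unsigned divisor = HighSpeedDivisor(csCode);
    return divisor ? kSCCClockHz / divisor : 0;
}
```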
No attempt has been made to abstract the Z8530 Serial Communications Controller hardware. The serial driver is intimately tied to this piece of legacy hardware. However, the bulk of the driver is relatively independent of the details of any specific DMA controller. A handful of primitive vectors are installed when the driver is opened and all DMA operations are handled in a device-independent manner by the main part of the SerialDMA driver.
Just to give an idea of the flexibility of the driver with respect to DMA models, the DMA controller in the Quadra 840av requires a pair of user-defined, linear DMA buffers of arbitrary size and automatically ping-pongs between them when the transfer count on each buffer goes to zero. The Power Macintosh 8100 contains a single system-defined, circular DMA buffer which interrupts when the transfer count goes to zero and then optionally continues transferring characters even while the interrupt awaits processing. The Power Macintosh 9500 uses a semi-intelligent DMA command processor which supports an arbitrary number of buffers. The SerialDMA driver supports all of these DMA models with ease, using only eight brief, abstract primitives and three or four interrupt handlers unique to each DMA controller.
It is understood that the Mixed Mode switches in the native PowerPC SerialDMA driver incur costly overhead, but it is thought that native mode execution of certain critical code sequences when DMA is suspended, or when delivering large packets of data with interrupts disabled, provides an overall win for the driver under the most challenging performance conditions.
The native version of the driver is packaged as a native code resource of type 'nsrd' with Mixed Mode calling conventions identical to those of the 'SERD' resource. The 'SERD' resource is for 68K machines only. The 'nsrd' resource is for PowerPC machines only.
The DMA engine offloads responsibility from the CPU for moving data between the SCC and memory. In the best case, data throughput is limited not by system interrupt latency but by the bandwidth of the memory system, which is usually much higher than the rates achievable by common serial I/O hardware. The DMA hardware generates interrupts only after a previously specified number of characters have been transferred. Comparing this to the non-DMA model, it is intuitive that the benefit of DMA transfer increases approximately linearly with the average size of the DMA transfer count. Every character transferred without processor intervention saves valuable processor time and improves response time for other interrupt-driven processes.
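To make the scaling concrete, the following C sketch estimates the interrupt rate for a saturated link as a function of the DMA transfer count. It assumes 10 bits per character on the wire (one start bit, eight data bits, one stop bit), which is an assumption of this example, and the function names are hypothetical.

```c
/* Characters per second for a given line rate and frame size in bits. */
unsigned long CharsPerSecond(unsigned long bitsPerSecond, unsigned bitsPerChar)
{
    return bitsPerSecond / bitsPerChar;
}

/* Interrupts per second when one interrupt is taken for every block of
   blockSize characters (rounded up). A blockSize of 1 models the
   non-DMA, interrupt-per-character case. */
unsigned long InterruptsPerSecond(unsigned long bitsPerSecond,
                                  unsigned bitsPerChar,
                                  unsigned long blockSize)
{
    unsigned long cps = CharsPerSecond(bitsPerSecond, bitsPerChar);
    return (cps + blockSize - 1) / blockSize;
}
```

Under these assumptions, a 230.4K bps stream needs 23,040 interrupts per second at one character per interrupt, but only 360 per second with a transfer count of 64.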
One factor which reduces the benefit of DMA is the increased complexity of the DMA interrupt handler relative to the SCC single-character I/O interrupt handler. It is important to overcome this brake on performance by taking advantage of the DMA benefits described earlier. In general, it should only be necessary to average a few characters per DMA block to overcome the increased overhead of a generalized DMA block handler versus a single-character handler. However, it should be understood that it is possible to operate the DMA serial driver in ways that do not take advantage of DMA and incur even greater overhead than the traditional interrupt-driven serial driver. Every attempt should be made to avoid such inefficiencies when performance is critical or data throughput is high by traditional Macintosh serial I/O standards.
As previously stated, the DMA serial driver causes the hardware to generate interrupts upon the completion of a DMA block transfer. It also responds to status interrupts in exactly the same manner as the traditional Macintosh serial driver, so each time the state of the CTS input changes or a break condition is detected, an interrupt must be serviced. Furthermore, various serial driver API calls invoke the driver's interrupt handler in order to synchronize the DMA engine with pending I/O requests, handshaking thresholds, and so on. These implicit interrupts, which occur as a result of API calls, are required to approximate the responsiveness to certain events which is supported by the Macintosh serial driver API.
XOn/XOff output handshaking is extraordinarily expensive to support within a DMA serial driver. To guarantee acceptable response times to the reception of XOn and XOff characters, the driver must suspend most of the benefits of DMA and interrupt on every single received character, just like the non-DMA serial driver. In fact it is worse, because the DMA interrupt handler is more complex and time-consuming than a standard receive character interrupt handler. Nevertheless, since the DMA driver cannot support old-fashioned pollprocs for immunity to interrupt latency, use of the DMA interrupt handler on every character is the only viable recourse. As a result, XOn/XOff handshaking is not recommended at high data rates (as a rule of thumb, 57,600 bps is probably too fast for efficient software handshaking).
It is not quite as punishing, but any type of input handshaking may put a limit on the DMA transfer count (and therefore available DMA resources) because it is necessary to generate an interrupt and assert flow control when the buffer threshold is reached.
Regardless of the initially programmed DMA transfer count, special receive conditions must terminate active DMA transfers (again, limiting the available DMA resources) in order to support extraction of corrupted characters from the data stream in accordance with the serial driver specification. Reception of break sequences may also temporarily limit DMA resources, or even suspend receive channel DMA during the break assertion.
Because DMA transfer counts are dependent upon the handshaking mode, buffer size, read request size, and other factors, numerous Control and Status calls require that DMA be stopped temporarily and restarted with new parameters. This involves some overhead which can be avoided frequently through more sophisticated use of the serial driver API. For example, rather than polling SerGetBuf frequently, it is much more efficient to make a single asynchronous Read request for some expected amount of data, and time out the request with a KillIO sometime later if it is not forthcoming. There is no overhead whatsoever to leave a pending read request while no data are being transferred by the SCC, but there is a great deal of overhead polling SerGetBuf in a loop. This is true of both DMA and non-DMA serial drivers.
Another performance killer is the commonly used small, chained read algorithm. The purpose of DMA is to stream relatively large quantities of data at high speed, and posing frequent, one-character read requests is very counterproductive. Each time a read completes, an interrupt must occur. Also, each time a read is issued, an implicit interrupt results to synchronize the DMA engine with the client's request count. The key to sustaining high data rates is in limiting the number of system interrupts and reaping some economy of scale in block data processing.
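The cost of small chained reads can be estimated with a short C sketch. It counts only the two interrupts the text attributes to each read request (one implicit interrupt when the read is issued, one when it completes) and ignores any others; the function name is hypothetical.

```c
/* Interrupts attributable to read requests when totalBytes are consumed
   through chained reads of readSize bytes each. Each read is assumed to
   cost two interrupts: one on issue and one on completion.
   Returns 0 if readSize is 0. */
unsigned long ChainedReadInterrupts(unsigned long totalBytes,
                                    unsigned long readSize)
{
    unsigned long reads;

    if (readSize == 0)
        return 0;
    reads = (totalBytes + readSize - 1) / readSize;  /* round up */
    return 2 * reads;
}
```

By this count, moving 1024 bytes through one-character reads costs 2048 interrupts, while 256-byte reads cost 8.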